Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics

Authors

Rico Petrick

Xugang Lu

Masashi Unoki

Masato Akagi

Rüdiger Hoffmann

Abstract

This paper proposes two methods for robust automatic speech recognition (ASR) in reverberant environments. Unlike other methods, which mostly achieve dereverberation by inverse filtering with blindly estimated room impulse responses, the proposed methods exploit the characteristics of speech itself. The first method, Harmonicity-based Feature Analysis, takes advantage of the harmonic components of speech, which are assumed to be undistorted. The second method, Temporal Power Envelope Feature Analysis, utilizes the temporal modulation structure of speech, which represents the phoneme-level temporal events that carry most of the intelligibility information. Both methods improve recognition performance remarkably, each in a different way, and combining them joins their individual advantages. To examine how well harmonicity and temporal modulation structure serve reverberant ASR, the methods are tested with both clean and reverberant training. The results show that, even in strongly reverberant conditions, both methods achieve practically applicable performance with reverberant training. In addition to testing performance as a function of reverberation time, performance is also tested as a function of speaker-to-microphone distance, which is another new contribution of this paper.
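
As a rough illustration of what a temporal power envelope is, the following Python sketch squares a signal and smooths the result with a moving average. It is not the authors' Temporal Power Envelope Feature Analysis, whose details are not given in the abstract; the function name, the 20 ms window, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def temporal_power_envelope(signal, sample_rate, window_ms=20.0):
    """Return a smoothed power envelope of `signal`.

    Hypothetical helper for illustration only: the instantaneous power
    (signal squared) is low-pass filtered with a moving average whose
    length is `window_ms` milliseconds.
    """
    x = np.asarray(signal, dtype=float)
    # Window length in samples, at least one sample.
    win = max(1, int(round(sample_rate * window_ms / 1000.0)))
    kernel = np.ones(win) / win
    # mode="same" keeps the envelope aligned with the input length.
    return np.convolve(x ** 2, kernel, mode="same")


if __name__ == "__main__":
    sr = 1000  # assumed sample rate in Hz
    t = np.arange(sr) / sr
    # White-noise carrier modulated by a slow 4 Hz amplitude pattern.
    rng = np.random.default_rng(0)
    modulator = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
    envelope = temporal_power_envelope(modulator * rng.standard_normal(sr), sr)
    print(envelope.shape)
```

In reverberant speech the slow fluctuations of such an envelope are smeared, which is why the temporal modulation structure is a natural target for a robust front end.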

Similar resources

A Two-Channel Acoustic Front-End for Robust Automatic Speech Recognition in Noisy and Reverberant Environments

An acoustic front-end for robust automatic speech recognition in noisy and reverberant environments is proposed in this contribution. It comprises a blind source separation-based signal extraction scheme and requires only two microphone signals. The proposed front-end and its integration into the recognition system are analyzed and evaluated in noisy living room-like environments according to th...

Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique

In this paper we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric, level-dependent compressive gammachirp filterbank and a sigmoid-shaped weighting rule for enhancing speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in ad...

Perceptually Inspired Signal-processing Strategies for Robust Speech Recognition in Reverberant Environments

An MTF-based blind restoration of temporal power envelopes as a front-end processor for automatic speech recognition systems in reverberant environments

To reduce speech degradation in reverberant environments, we previously proposed a modulation transfer function (MTF) based method of speech restoration. The room impulse response (RIR) does not need to be measured at any time in this restoration, since we modeled the power envelope of the RIRs as an exponential decay function. Speech is assumed to be temporally modulated with white noise carrier ...

Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum

The goal of this work is to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments. In this paper we present a compressive gammachirp filter-bank-based feature extractor that incorporates a method for enhancing the auditory spectrum and a short-time feature normalization technique, which, by adjusting the scale and mean of cepstral feat...

Publication date: 2008
